Evaluating prosody in synthetic speech with online (eye-tracking) and offline (rating) methods
نویسندگان
چکیده
This study examines the relationship between online processing effects observed in earlier eye-tracking experiments [1, 2] and offline quality ratings gathered for the synthetic and natural speech stimuli used in these experiments, along with their acoustic-prosodic properties. White et al. [2] reported that even high-quality synthetic speech failed to replicate the facilitative effect of contextually appropriate accent patterns found with human speech, while it produced a more robust intonational garden-path effect with contextually inappropriate patterns. They conjectured that both of these effects could be due to processing delays observed with the synthetic speech. In this paper, we present an acoustic analysis of the stimuli used in the eye-tracking experiments and an offline stimuli rating task, which was designed to investigate whether a context-independent measure of utterance quality could predict processing-based effects. The analysis reveals that for synthetic speech, longer adjectives—which provide more processing time—do facilitate anticipatory looks to the target. Larger values of F0 drop (difference between the F0 values of the adjective and following noun) also negatively influenced looks to the target and were negatively correlated with offline ratings, suggesting that this may be a specific acoustic factor that merits attention in future work on improving synthesis quality. Finally, the study shows that online measures of unconscious processing and offline measures of conscious judgments, taken together, can provide a more comprehensive evaluation of synthetic speech than either method alone.
منابع مشابه
Eye tracking for the online evaluation of prosody in speech synthesis: not so fast!
This paper presents an eye-tracking experiment comparing the processing of different accent patterns in unit selection synthesis and human speech. The synthetic speech results failed to replicate the facilitative effect of contextually appropriate accent patterns found with human speech, while producing a more robust intonational garden-path effect with contextually inappropriate patterns, both...
متن کاملToward Emotionally Accessible Massive Open Online Courses (MOOCs)
This paper outlines an approach to evaluating the emotional content of three Massive Open Online Courses (MOOCs) using the affective computing approach of prosody detection on two different text-to-speech voices in conjunction with human raters judging the emotional content of course text. The intent of this work is to establish the potential variation on the emotional delivery of MOOC material...
متن کاملUsing eye movements for online evaluation of speech synthesis
This paper describes an eye tracking experiment to study the processing of diphone synthesis, unit selection synthesis, and human speech taking segmental and suprasegmental speech quality into account. The results showed that both factors influenced the processing of human and synthetic speech, and confirmed that eye tracking is a promising albeit time consuming research method to evaluate synt...
متن کاملThe online evaluation of speech synthesis using eye movements
This paper describes an eye tracking experiment to study the processing of diphone synthesis, unit selection synthesis, and human speech taking segmental and suprasegmental speech quality into account. The results showed that both factors influenced the processing of human and synthetic speech, and confirmed that eye tracking is a promising albeit time consuming research method to evaluate synt...
متن کاملApplication of eye tracking technology for screening and rehabilitation of children with special needs.
Eye tracking is a technology for measuring eye movements when looking at a point. The length of time to look at a particular situation, the sequence of observations, as well as points of attention are processed in the process of eye tracking and eye movements are identified by topics such as fixation, saccade, etc. Information about how the eyes move at a special moment has many potential appli...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010